GH-48897: [C++] Benchmark and optimize CountSetBits by pitrou · Pull Request #48898 · apache/arrow

pitrou · 2026-01-19T12:35:44Z

Rationale for this change

Counting the set bits in a null bitmap is an operation that comes up often, so it is useful to get a more precise idea of its performance.

What changes are included in this PR?

Add a benchmark for CountSetBits.
Hand-unroll its inner loop for better performance as otherwise the compiler may not respect the nested loop hint.
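
A minimal sketch of what hand-unrolling a popcount loop can look like. This is not the code merged in this PR: the helper names `PopCount64` and `CountSetBitsUnrolled`, the unroll factor of 4, and the assumption of whole 64-bit words as input are all illustrative choices.

```cpp
#include <bit>      // std::popcount when compiled as C++20
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: population count of one 64-bit word.
inline int PopCount64(std::uint64_t word) {
#if defined(__cpp_lib_bitops)
  return std::popcount(word);
#else
  return __builtin_popcountll(word);  // GCC/Clang builtin, assumed available
#endif
}

// Hypothetical helper, not Arrow's implementation: counts set bits in
// `num_words` 64-bit words, with the main loop unrolled by hand.
inline std::int64_t CountSetBitsUnrolled(const std::uint64_t* words,
                                         std::size_t num_words) {
  assert(words != nullptr || num_words == 0);
  // Four independent accumulators avoid a single serial dependency chain,
  // so several popcounts can be in flight per cycle.
  std::int64_t c0 = 0, c1 = 0, c2 = 0, c3 = 0;
  std::size_t i = 0;
  for (; i + 4 <= num_words; i += 4) {
    c0 += PopCount64(words[i]);
    c1 += PopCount64(words[i + 1]);
    c2 += PopCount64(words[i + 2]);
    c3 += PopCount64(words[i + 3]);
  }
  std::int64_t total = c0 + c1 + c2 + c3;
  // Tail loop for the remaining 0 to 3 words.
  for (; i < num_words; ++i) {
    total += PopCount64(words[i]);
  }
  return total;
}
```

Writing the four statements out explicitly means the result no longer depends on whether the compiler keeps or flattens a nested loop intended as an unrolling hint.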

Local results (AMD Zen 2):

------------------------------------------------------------------------------
Benchmark                    Time             CPU   Iterations UserCounters...
------------------------------------------------------------------------------
CountSetBits/16           5.27 ns         5.27 ns    133267114 bytes_per_second=2.82991Gi/s
CountSetBits/1024         35.1 ns         35.1 ns     19960309 bytes_per_second=27.178Gi/s
CountSetBits/131072       3703 ns         3702 ns       184743 bytes_per_second=32.9698Gi/s

Local results (Intel(R) Core(TM) Ultra 7 255H):

------------------------------------------------------------------------------
Benchmark                    Time             CPU   Iterations UserCounters...
------------------------------------------------------------------------------
CountSetBits/16           2.45 ns         2.45 ns    285392946 bytes_per_second=6.08012Gi/s
CountSetBits/1024         28.9 ns         28.9 ns     23618777 bytes_per_second=33.0086Gi/s
CountSetBits/131072       3490 ns         3489 ns       198472 bytes_per_second=34.9862Gi/s

Are these changes tested?

By running said benchmark manually (and by Continuous Benchmarking).

Are there any user-facing changes?

No.

GitHub Issue: [C++] Add benchmark for CountSetBits #48897

pitrou · 2026-01-19T12:39:30Z

@wgtmac @zanmato1984

pitrou · 2026-01-19T15:17:18Z

Also @AntoinePrv FYI

raulcd · 2026-01-21T09:15:45Z

@ursabot please benchmark

pitrou · 2026-01-21T09:21:56Z

Hmm, it seems performance is behind the expected theoretical throughput.

From Agner Fog's instruction tables, I see that AMD Zen 2 should be able to sustain 4 POPCNT operations/cycle (reciprocal throughput = 0.25), i.e. 32 bytes/cycle on 64-bit ints.
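
The arithmetic behind that estimate can be sketched as follows. The function names are hypothetical, and the clock frequency is a parameter the reader must supply: it is not stated anywhere in this thread.

```cpp
#include <cassert>

constexpr double kBytesPerGiB = 1024.0 * 1024.0 * 1024.0;

// Peak bytes consumed per cycle if the POPCNT issue rate were the only limit.
constexpr double PeakBytesPerCycle(double popcnt_per_cycle, int bytes_per_word) {
  return popcnt_per_cycle * bytes_per_word;
}

// Converts a measured bytes_per_second figure (in GiB/s) into bytes per
// cycle for an assumed core clock in Hz.
inline double MeasuredBytesPerCycle(double gib_per_second, double clock_hz) {
  assert(clock_hz > 0.0);
  return gib_per_second * kBytesPerGiB / clock_hz;
}

// 4 POPCNT per cycle on 8-byte words gives the 32 bytes/cycle quoted above.
static_assert(PeakBytesPerCycle(4.0, 8) == 32.0, "4 * 8 bytes per cycle");
```

For example, at a purely hypothetical 4.0 GHz clock, `MeasuredBytesPerCycle(32.9698, 4.0e9)` comes to roughly 8.85 bytes/cycle. The POPCNT issue rate is only an upper bound; other limits such as load bandwidth may apply before it is reached.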

pitrou · 2026-01-21T09:36:23Z

Ok, the nested for-loop is un-nested by gcc 15.2.0...

pitrou · 2026-01-21T09:56:51Z

Updated benchmark numbers after I hand-unrolled the loop.

pitrou · 2026-01-21T10:00:15Z

@github-actions crossbow submit -g cpp

zanmato1984

+1

github-actions · 2026-01-21T10:02:50Z

Revision: d0f45cf

Submitted crossbow builds: ursacomputing/crossbow @ actions-cdbe33a753

Task	Status
example-cpp-minimal-build-static
example-cpp-minimal-build-static-system-dependency
example-cpp-tutorial
test-build-cpp-fuzz
test-conda-cpp
test-conda-cpp-valgrind
test-debian-12-cpp-amd64
test-debian-12-cpp-i386
test-debian-experimental-cpp-gcc-15
test-fedora-42-cpp
test-ubuntu-22.04-cpp
test-ubuntu-22.04-cpp-20
test-ubuntu-22.04-cpp-bundled
test-ubuntu-22.04-cpp-emscripten
test-ubuntu-22.04-cpp-no-threading
test-ubuntu-24.04-cpp
test-ubuntu-24.04-cpp-bundled-offline
test-ubuntu-24.04-cpp-gcc-13-bundled
test-ubuntu-24.04-cpp-gcc-14
test-ubuntu-24.04-cpp-minimal-with-formats
test-ubuntu-24.04-cpp-thread-sanitizer

rok · 2026-01-21T11:02:44Z

@ursabot please benchmark

pitrou · 2026-01-21T11:12:46Z

@rok I have deleted the branch, so I'm not sure that can work?

rok · 2026-01-21T11:13:29Z

I see the event on Kubernetes, but the GitHub API token had expired, so it couldn't post back.
Edit: you're probably right about the closed PR though.

rok · 2026-01-21T11:15:23Z

Trying on #48907

conbench-apache-arrow · 2026-01-21T18:36:39Z

After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit ed35594.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 10 possible false positives for unstable benchmarks that are known to sometimes produce them.

github-actions bot added Component: C++ and awaiting review labels Jan 19, 2026

wgtmac approved these changes Jan 20, 2026

github-actions bot added awaiting committer review and removed awaiting review labels Jan 20, 2026

pitrou force-pushed the gh48897-countsetbits-benchmark branch from 08383d7 to 9921e9d Compare January 21, 2026 09:22

GH-48897: [C++] Add benchmark for CountSetBits

d0f45cf

pitrou force-pushed the gh48897-countsetbits-benchmark branch from 9921e9d to d0f45cf Compare January 21, 2026 09:43

pitrou changed the title ~~GH-48897: [C++] Add benchmark for CountSetBits~~ GH-48897: [C++] Benchmark and optimize CountSetBits Jan 21, 2026

zanmato1984 approved these changes Jan 21, 2026

pitrou merged commit ed35594 into apache:main Jan 21, 2026
54 of 55 checks passed

pitrou removed the awaiting committer review label Jan 21, 2026

pitrou mentioned this pull request Jan 21, 2026

[C++] Add benchmark for CountSetBits #48897

Closed

pitrou deleted the gh48897-countsetbits-benchmark branch January 21, 2026 10:35
